Detect local arguments and functions used in a function #164

Merged
samwaseda merged 46 commits into main from args on Apr 2, 2026

@samwaseda
Member

@samwaseda samwaseda commented Feb 26, 2026

Following the effort in #155, I started trying to parse variables as well. Edits will follow....

Summary

Adds undefined-variable detection to dependency_parser to identify and resolve external symbols used within a function. This enables failing fast with a clear error when a function depends on unversioned external variables or callables, rather than silently producing incorrect provenance.

Related Issue(s)

Backward Compatibility

The get_call_dependencies and find_undefined_variables function signatures have been narrowed: the func_or_var parameter type changed from Callable | object to Callable[..., Any] | type[Any]. Callers passing arbitrary non-callable, non-type objects will now get a type error.

Implementation Notes

  • Added UndefinedVariableVisitor AST visitor that collects used and locally-defined variable names; local (nested) function definitions raise NotImplementedError to fail fast. The function name itself, all argument kinds (posonlyargs, args, kwonlyargs, *args, **kwargs), and class definitions are all tracked in defined_vars to avoid false positives. import and from … import statements inside a function body are collected in .imports / .import_froms.
  • Added find_undefined_variables to extract symbols used but not defined in a function's source and resolve them via object_scope. Built-in names are excluded; source retrieval failures return {} gracefully; unparseable source (e.g. SyntaxError) also returns {} gracefully.
  • Reworked get_call_dependencies to discover dependencies from undefined variables rather than call-graph traversal; recursion into unversioned dependencies is guarded by a callable(obj) or isinstance(obj, type) type guard so non-callable values are recorded but not recursed into.
  • Fixed mypy errors: added from typing import Any, narrowed func_or_var type annotations to Callable[..., Any] | type[Any], and added a type guard before the recursive call to satisfy static analysis.
  • Added comprehensive unit tests (20 tests, up from 6) achieving 100% line coverage of dependency_parser.py. Tests cover all new components: UndefinedVariableVisitor (function name tracking, class def tracking, import collection, top-level async def, nested local function/async function raising NotImplementedError), find_undefined_variables (builtins excluded, graceful failure for uninspectable callables, graceful failure on SyntaxError, argument exclusion), and get_call_dependencies (empty result, cycle detection, versioned dependency collection, recursion into unversioned callables, non-callable values not recursed). Tests use only CI-available packages (pydantic, pyiron_snippets).
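The visitor described in the notes above can be sketched roughly as follows. This is a simplified illustration, not the code merged in this PR: `NameUsageVisitor` and `undefined_names` are hypothetical names, resolution via object_scope is left out, and the sketch ignores statement order (a name assigned anywhere in the function counts as defined).

```python
import ast
import builtins
import textwrap


class NameUsageVisitor(ast.NodeVisitor):
    """Hypothetical, simplified stand-in for UndefinedVariableVisitor."""

    def __init__(self) -> None:
        self.used: set[str] = set()
        self.defined: set[str] = set()
        self._depth = 0  # how many function definitions we are inside

    def visit_FunctionDef(self, node: ast.FunctionDef) -> None:
        if self._depth > 0:
            # Fail fast on local (nested) function definitions.
            raise NotImplementedError("nested function definitions are not supported")
        self._depth += 1
        # The function's own name is defined, so recursive calls are not flagged.
        self.defined.add(node.name)
        a = node.args
        for arg in (*a.posonlyargs, *a.args, *a.kwonlyargs):
            self.defined.add(arg.arg)
        for star in (a.vararg, a.kwarg):  # *args and **kwargs, if present
            if star is not None:
                self.defined.add(star.arg)
        self.generic_visit(node)
        self._depth -= 1

    def visit_Name(self, node: ast.Name) -> None:
        if isinstance(node.ctx, ast.Store):
            self.defined.add(node.id)
        else:
            self.used.add(node.id)


def undefined_names(source: str) -> set[str]:
    """Return names that are used but neither defined locally nor built in."""
    visitor = NameUsageVisitor()
    visitor.visit(ast.parse(textwrap.dedent(source)))
    return {n for n in visitor.used - visitor.defined if not hasattr(builtins, n)}
```

For a function body such as `y = x * FACTOR; return helper(len(args)) + y`, with `x` and `*args` as parameters, this sketch would report `FACTOR` and `helper` and skip `len` because it is a builtin.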

@github-actions

Binder 👈 Launch a binder notebook on branch pyiron/flowrep/args

@samwaseda samwaseda changed the base branch from main to crawler February 26, 2026 08:36
@samwaseda samwaseda marked this pull request as draft February 26, 2026 08:37
@samwaseda
Member Author

@liamhuber I started looking into the possibility of including variables as well. Frankly, I'm also not really convinced, but I would still like to explore the possibility.

@codecov

codecov bot commented Feb 26, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.81%. Comparing base (2b96eac) to head (d2073d2).
⚠️ Report is 47 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff             @@
##              main     #164      +/-   ##
===========================================
- Coverage   100.00%   98.81%   -1.19%     
===========================================
  Files           30       30              
  Lines         1990     2022      +32     
===========================================
+ Hits          1990     1998       +8     
- Misses           0       24      +24     


@samwaseda
Member Author

I just had a discussion with @XzzX and we think we can keep the current implementation: it is maybe not quite as useless as I had imagined, because with this implementation it is now possible to throw an error as soon as an external variable is used.

@liamhuber
Member

because with this implementation it is now possible to throw an error as soon as an external variable is used.

Failing cleanly is infinitely better than looking like everything is well-provenanced when it isn't. However, I still think that restricting "ok locally defined stuff" to mean "don't use any symbols outside your function definition" is going to turn out to be a more annoying, more restrictive, more-work solution than telling people "you'll have to use well-versioned code to contribute to the KG". As an intermediate step towards catching and leveraging these external symbols, this still seems like solid progress though.

Base automatically changed from crawler to main February 27, 2026 07:20
@samwaseda
Member Author

@copilot Resolve the mypy issues

Contributor

Copilot AI commented Apr 1, 2026

@copilot Resolve the mypy issues

Fixed in commit 1e85c8b. The two mypy errors were:

  1. inspect.getsource(func_or_var): object is too broad for getsource's expected types
  2. object_scope.get_scope(func_or_var): expects Callable[..., Any] | type[Any], not object

Changes made to dependency_parser.py:

  • Added from typing import Any
  • Narrowed func_or_var: Callable | object to func_or_var: Callable[..., Any] | type[Any] in both find_undefined_variables and get_call_dependencies
  • Added a callable(obj) or isinstance(obj, type) type guard before the recursive get_call_dependencies call so mypy can verify the narrowed type

@samwaseda
Member Author

@copilot Write unit tests to cover all parts introduced in this PR

@samwaseda
Member Author

@copilot Write unit tests to cover all parts introduced in this PR

…, and get_call_dependencies

Agent-Logs-Url: https://github.com/pyiron/flowrep/sessions/07c0e92a-bc0a-4207-9ad4-01cc7fe4e409

Co-authored-by: samwaseda <37879103+samwaseda@users.noreply.github.com>
Contributor

Copilot AI commented Apr 1, 2026

@copilot Write unit tests to cover all parts introduced in this PR

Added in commit 9f326f8. The test suite now has 19 tests (up from 6), covering all new parts:

UndefinedVariableVisitor — function name tracked for recursive calls, class def tracking, import/from … import collection, top-level async def accepted without error.

find_undefined_variables — builtins (len, int, …) excluded, graceful {} return when source unavailable (built-in callables), all argument kinds excluded.

get_call_dependencies (had no tests before) — empty result when function uses only its own args, early return when FQN is already in _visited, versioned dependencies stored with correct version info, unversioned callables recursed into, non-callable values recorded but not recursed.

Only packages available in CI (pydantic, pyiron_snippets) are used; all tests use the unittest framework.

@copilot Write unit tests to cover all parts introduced in this PR

Done in commit 9f326f8 — see my reply above for the full breakdown of what was added.

@samwaseda samwaseda requested a review from liamhuber April 1, 2026 20:42
@samwaseda
Member Author

@copilot cover ALL lines added in this PR in the unit tests

Copilot AI and others added 2 commits April 1, 2026 20:51
Contributor

Copilot AI commented Apr 1, 2026

@copilot cover ALL lines added in this PR in the unit tests

Done in commit 641fbaf. Added test_syntax_error_in_source_returns_empty_dict which covers the last uncovered branch (lines 128-130): patching ast.parse to raise SyntaxError verifies that find_undefined_variables returns {} gracefully. dependency_parser.py now has 100% line coverage across all 20 tests.
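For readers unfamiliar with the patching approach, a test of this kind might look roughly like the sketch below. It is not the test added in 641fbaf: `names_or_empty` is a hypothetical stand-in that only mirrors the "return {} on failure" behaviour, and `textwrap.dedent` is used as the target purely because its source is normally available on a standard CPython install.

```python
import ast
import inspect
import textwrap
import unittest
from unittest import mock


def names_or_empty(func) -> dict[str, str]:
    """Hypothetical stand-in: return {} if the source cannot be read or parsed."""
    try:
        source = inspect.getsource(func)
        tree = ast.parse(source)
    except (OSError, TypeError, SyntaxError):
        return {}
    # Map each name in the source to its context (Load, Store or Del).
    return {
        n.id: type(n.ctx).__name__
        for n in ast.walk(tree)
        if isinstance(n, ast.Name)
    }


class SyntaxErrorFallbackTest(unittest.TestCase):
    def test_syntax_error_returns_empty_dict(self):
        # Force the parse step to fail without needing malformed source.
        with mock.patch("ast.parse", side_effect=SyntaxError("boom")) as parse_mock:
            result = names_or_empty(textwrap.dedent)
        self.assertEqual(result, {})
        parse_mock.assert_called_once()
```

Patching `ast.parse` reaches the except branch without having to construct a callable whose source is actually invalid.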

@samwaseda
Member Author

@liamhuber I'm gonna merge this one for now, because I think it's good enough for a first version, and as long as it is not actively used for provenance there is no imminent danger. Besides this PR is getting a bit long.

@samwaseda samwaseda merged commit 881ec07 into main Apr 2, 2026
20 of 21 checks passed
@samwaseda samwaseda deleted the args branch April 2, 2026 09:50
@liamhuber
Member

Sounds good 👍 I'll look over the changes today and just open a fresh PR if there's any little patches I'm keen on.

def _fqns(deps: dependency_parser.CallDependencies) -> set[str]:
    return {info.fully_qualified_name for info in deps}

def test_syntax_error_in_source_returns_empty_dict(self):
    """When ``ast.parse`` raises ``SyntaxError``, the result must be ``{}``."""
Member

@samwaseda, can you explain this to me?

Since find_undefined_variables takes a Callable and not a source code string, shouldn't it be impossible to hit a SyntaxError because the interpreter will already have needed to parse the Callable to make it an object?

If somehow we did hit a syntax error, why would we want to let it fail gracefully? Wouldn't we want to complain to the user that we don't know what the undefined variables might be instead of simply claiming there aren't any?

Member

Relevant commit 777a5582

Member

This is my only real question/concern, everything else looks great!


-class CallCollector(ast.NodeVisitor):
+class UndefinedVariableVisitor(ast.NodeVisitor):
+    """AST visitor that collects used and locally-defined variable names.
Member

smells like AI newline docstring 😝

5 participants